Skip to content

Conversation

@Molter73
Copy link
Collaborator

Description

Some improvements have been made to our ansible playbooks, namely:

  • Remove us-central1-c zone, since this one doesn't have Arm VM support.
  • Update ansible to use the newer break_when loop control, this is closer to our original intent.
  • Ensure the compute instance is removed when creation fails.

Checklist

  • Investigated and inspected CI test results
  • Updated documentation accordingly

Automated testing

  • Added unit tests
  • Added integration tests
  • Added regression tests

If any of these don't apply, please comment below.

Testing Performed

CI should be enough.

This zone does not have support for Arm VMs, which is causing it to fail
the entire workflow when we rollover to it.
This is a newer approach to break out of loops in ansible which is
closer to what we originally intended.
@codecov-commenter
Copy link

codecov-commenter commented Aug 18, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 28.83%. Comparing base (c467e0e) to head (8acdf76).
⚠️ Report is 23 commits behind head on master.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2358   +/-   ##
=======================================
  Coverage   28.83%   28.83%           
=======================================
  Files          96       96           
  Lines        5799     5799           
  Branches     2551     2551           
=======================================
  Hits         1672     1672           
  Misses       3408     3408           
  Partials      719      719           
Flag Coverage Δ
collector-unit-tests 28.83% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Molter73 Molter73 force-pushed the mauro/ansible/miscellaneous-resilience-improvements branch from df4c39b to b42cb48 Compare August 18, 2025 15:12
- us-central1-a
- us-central1-b
- us-central1-c
- us-central1-f
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- us-central1-f
- us-central1-f
- us-central1-d

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$ gcloud compute machine-types list --filter="name=t2a-standard-2 AND zone~'^us-'" --format="value(zone)"
us-central1-a
us-central1-b
us-central1-f
us-central1-d

@Molter73
Copy link
Collaborator Author

So the changes in this PR are not really doing what I intended them to do.

The current state on master is that ansible will create a VM on us-central1-a attempt to start it and fail at this point, leaving the VM created and causing the subsequent zones to fail.

The intention of this PR was to attempt to remove the VM once it fails, this can be achieved with a block, but you can't combine loops and blocks in ansible, so I created a separate file with tasks and attempted to include that one in a loop, when doing this, the task will be added once per zones and run concurrently, which means whichever creates the VM first will be used and all others will fail. I believe this will not solve our initial problem, since the actual error happens when the VM is attempted to be started and can still cause the same issue we are seeing.

@Stringy Stringy marked this pull request as ready for review September 4, 2025 08:54
@Stringy Stringy requested a review from a team as a code owner September 4, 2025 08:54
@Stringy Stringy enabled auto-merge (squash) September 4, 2025 09:23
@Stringy Stringy merged commit db79889 into master Sep 4, 2025
75 of 84 checks passed
@Stringy Stringy deleted the mauro/ansible/miscellaneous-resilience-improvements branch September 4, 2025 10:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants